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Abstract 



In recent papers we considered how two large service systems that are primarily designed to operate 
independently, can help each other in face of unexpected overloads, due to a sudden change in the arrival 
rates. We proposed an overload control, which we named fixed-queue-ratio with thresholds (FQR-T), 
whose aim was to prevent any sharing of customers, i.e., sending customers from one class to be served 

■ in the other class' pool, during normal loads, and to initiate sharing automatically once a threshold 
\ is crossed, in which case the corresponding pool is considered overloaded. The goal is to keep the 

(-H ■ relation between the two queues fixed at a certain ratio, which is optimal in a deterministic "fluid" 

approximation, assuming a holding cost is incurred on the two queues. To avoid harmful sharing our 
control includes the one-way sharing rule, stipulating that sharing is allowed in only one direction at any 
time. In this paper we consider a more complex time-varying environment, in which the arrival rates and 
staffing levels are time dependent, so that the system may fluctuate between periods of various loads, with 
overloads possible in either direction. We show that FQR-T needs to be modified to account for these 
!>■ ■ more complex settings, since it may be slow to react to the changing environment, and may even cause 

sever fluctuations once the arrival rates return to normal after an overload incident. Our new control, FQR 
with activation-and- release thresholds (FQR-ART) is designed to automatically respond to changes in 
\ the environment by initiating sharing in the right direction quickly, if that is needed, while avoiding 

harmful phenomenons, such as congestion collapse and severe oscillations during normal loads. A novel 
I fluid approximation, described implicitly via an ordinary-differential equation (ODE) is developed, as 

■ well as an efficient algorithm to solve that ODE. 



1 Introduction 

Queueing systems typically experience time-varying loads and may also experience periods of congestions. 
Overload incidents are prevalent phenomenons in communication networks, healthcare systems, call centers, 
etc. In this paper we are motivated by a call-center application, but the insights and methods we develop can 
be employed in other queueing settings. 

In a typical call center, under normal circumstances, the arrival rates vary by time of day in a predictable 
way, and the staffing responds to that anticipated pattern, typically with fixed staffing levels over short time 
periods, such as half hours; see |[I] and llT3l for background. However, it is often not possible to have the 
exact staffing needed to meet pre-specified service-level constraints at all times. More specifically, there 
are time periods during the day in which the number of agents may be higher or lower than the number 
needed to provide the desired service levels for the customers in terms of, e.g., waiting times in queue. This 
can occur for several reasons such as (i) encountering arrival rates that are higher than initially forecasted; 
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(ii) the inability to have the staffing change together with the arrival rates appropriately due to human- 
resource constraints; (iii) agents absenteeism, e.g., agents not showing up to work, or taking unreported 
breaks during their shift; (iv) system malfunction, e.g., computer failure which prevents some, or all, agents 
from providing service. Such discrepancies between arrival rates and staffing, whether anticipated or not, 
can cause overloads in the system for long time periods, whose consequence is serious degradation of the 
system performance in terms of waiting times in queue, customer abandonment, etc. In addition, overloads 
can occur due to inappropriate control. 

In this paper we continue our research on how to exploit flexibility of queueing systems in some "opti- 
mal" manner in order to alleviate congestions due to having arrival rates being are higher than the system's 
service capacity. In particular, our aim is to design an automatic and easy-to-implement control, with the 
objective of efficiently routing customers between two different service pools in a time-varying environment 
encountering periods of overloads. Although having flexibility in stochastic queueing networks is generally 
perceived useful, we show that it sometimes needs to be exploited with caution in order to avoid bad per- 
formance, such as chaotic oscillations and congestion collapse; see ^below and ^1.2l for other examples in 
the literature. 

The Context. In |[25l we considered how two service systems, each consisting of a single many-agent 
service pool having a designated customer class, can help each other in face of an unexpected overload that 
is due to a sudden increase in the arrival rates. Since sharing of customers, i.e., sending customers from one 
class to be served in the other class pool, is potentially possible in the two directions, the model is known in 
the call-center literature as an X model; see Figure [T] 

In the setting of ||25]| . once an unexpected overload occurs, and only then, one of the two customer 
classes, the one deemed "more overloaded", should receive help from the other service pool in such a way 
that a specific ratio holds approximately between the two queues. That ratio is found by solving a deter- 
ministic "fluid" optimization problem, assuming a holding cost is incurred on both queues during overload 
incidents. There are two possible different target ratios - one for each direction of sharing - with rj j denot- 
ing the fluid-optimal ratio when queue i should receive help from pool j, i,j = 1, 2. Here we treat those 
two ratios as given parameters, and refer to ||25l for further details on how to compute them. 

To achieve the objectives described above we suggested the FQR-T control together with one-way shar- 
ing. The fixed-queue ratio is the (fluid-optimal) ratio discussed above, depending on the direction of over- 
load, and the thresholds are two positive numbers relating to the queues, whose purpose is to prevent sharing 
when neither pool is overloaded, and detect overloads quickly so that sharing can begin in the right direction 
once an overload incident begins. More specifically, once one of those activation thresholds is crossed, the 
system is considered to be overloaded, with the direction of the overload depending on the specific threshold 
that was crossed; we elaborate in f|2]below. The one-way sharing rule stipulates that no class-1 customers 
are allowed to be routed to pool 2 at any time t > if there are any class-2 customers in pool 1 at that time 
t, and similarly in the other direction. 

In |f26l we developed a fluid model to facilitate the analysis of the complex behavior of the X system 
under FQR-T during overloads, and performed a detailed rigorous analysis of that model in ll27l . Finally, 
in ||28]| we proved that the fluid model arises as a many-server heavy-traffic fluid limit of properly-scaled 
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sequence of stochastic X systems. Diffusion-limit refinements for tliat fluid limit were proved in ||29l . 

1.1 Contribution: Control and Fluid Analysis in Time- Varying Environment 

Building on fluid analysis, in this paper we show that FQR-T with one-way sharing is not adequate for 
time-varying environment, since it may be too slow to react to changing loads. To accelerate the response 
to switching overloads, we modify the one-way sharing rule by allowing class-i customers to be routed to 
pool j, provided that there are no more than j > class-j customers in pool j, if sharing is desired. 
We refer to ri 2 and r2,i as "release thresholds" (to be distinguished from the "activation thresholds" dis- 
cussed above). We name our modified control Fixed-Queue-Ratio with Activation and Release Thresholds, 
abbreviated FQR-ART. 

As was previously observed in numerous other settings (see ^1.2l below). overload controls may cause 
severe degradation of system's performance during normal loads if they are not executed with caution. In our 
case, if the activation thresholds in FQR-ART are not chosen carefully, a congestion collapse may occur, 
namely, the effective service rate is reduced, so that a normally-loaded system becomes overloaded. A 
congestion collapse result was also observed in ||25| and analyzed in 1261 and was due to inefficient sharing 
in both directions simultaneously. (Which is the main reason for having the one-way sharing rule.) However, 
unlike in those previous papers, here that phenomenon is due to chaotic oscillations of the service process 
that may occur when a system recovers from an overload period, and is due to the new release thresholds. 
We solve that problem by choosing larger activation thresholds; see ^below. 

Our contribution is threefold: (i) We demonstrate how and when it is beneficial (and harmful) to exploit 
the flexibility of the system, (ii) We design a control that automatically exploits that flexibility when this 
is beneficial, and prevents bad behaviors that can occur in the time-varying settings, (iii) We develop a 
novel fluid model to approximate the (untractable) stochastic system in time-varying environment. That 
fluid model is described implicitly as the solution of an ordinary differential equation (ODE), building on a 
stochastic averaging principle (AP); see ^5.2| below. Finally, we design an efficient algorithm to solve that 
ODE. 

Here is how the rest of the paper is organized. We conclude this section with a literature review. In 
^ we describe the stochastic X model and the FQR-T and FQR-ART controls. Building on simple fluid 
considerations. In §^ and |4] we demonstrate the need to modify FQR-T in order to adjust to the time- 
varying settings considered in this paper. Specifically, in f|3] we show why release thresholds are essential, 
and in f|4] we show that, unless precaution is taken, the release thresholds can cause congestion collapse 
when the system recovers from an overload. Specifically, a system which has enough potential service 
capacity becomes overloaded due to oscillatory behavior of the service process. To avoid that bad behavior, 
the activation thresholds need to be of a larger order of size than those suggested for the FQR-T control. 
Section f|5] is dedicated to a development of the fluid approximation to the system which is represented 
implicitly as a solution to an ODE. In f|6] we explain how this ODE can be solved numerically and provide 
an efficient algorithm to numerically solve that ODE. In ^we provide numerical examples, demonstrating 
the effectiveness of the fluid model by comparing the solution to the ODE's of the various cases to simulation 
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experiments. In ^we discuss stationarity and quasistationarity of the fluid model. Finally, we conclude in 
^and suggest some possible extensions and further research. 

1.2 Literature Review 

Steady-state performance measures are sometimes possible to compute using, for example, standard CTMC 
theory or transforms methods; see, e.g., |j9] and IIBTI for "exact analysis" of overload controls. If Laplace 
transforms are employed, then numerical algorithms to invert those transforms are needed; see, e.g., |@] 
for applications in related settings. However, as the models become more complicated, even numerical 
computations of steady state quantities and transforms become impractical. (For example, if the models 
are non-stationary; of high dimension; do not posses any form of reversibility; etc.) In particular, when 
overload controls are considered, it is important to analyze the time-dependent (non-stationary) evolution of 
the system under consideration in order to study its behavior when overloads begin as well as its recovery 
from overload incidents. Thus, even in the simplest Markovian settings, there is need to use approximations 
and/or simulations. Since we are concerned with the untractable behavior of a highly complex and non- 
stationary system, the following review focuses on examples of approximation methods and simulations to 
analyze complex queueing systems that are related, at least in some aspects, to our current study. 

Time- Varying Systems. There is growing interest in systems having time-dependent arrival rates and 
staffing functions. For such systems, fluid models are especially valuable from both the operational and 
analytical perspectives. From the operational aspect, fluid models are important since they approximate the 
main trends in the system's behavior. Thus, from a managerial point of view, fluid models are superior 
to diffusion approximations in non-stationary settings. From the analytical aspect, fluid models are easier 
to derive (even if it is hard to prove that they arise as bona-fide weak limits) and are much more tractable 
than approximating diffusions; see |[2T1 l22l |23l for current results on queues in time-varying environments 
experiencing periods of overloads, as well as for many more references on that subject. We do point out 
that to stabilize the system at a stationary state when it is experiencing time-varying arrival rates, we employ 
time- varying staffing functions based on the infinite-server approximation in ifTTI . See also §3 in flTl . 
(However, here we are concerned with fluid models, so we do not employ square-root staffing refinements, 
as in the latter citations, which have no effect on the fluid approximations.) 

Routing, Sclieduling and Staffing of Overloaded Systems. As staffing decisions for call centers are 
done in advance, it is often the case that the number of agents present cannot be changed in response to 
the realized arrival rates. A multiple class and pool call center model is studied in |[T6l . assuming that the 
arrival rates to the system are themselves stochastic processes (e.g., the arrival process might be a doubly- 
stochastic Poisson process). Since fluid approximations are employed to optimize the system, with the 
objective of minimizing abandonment in addition to staffing costs, the analysis in lfT6i is carried out for 
congested systems. 

In Q a system with multiple streams of arriving jobs evolving in discrete time is considered during 
a temporary overload period, where the overload in that reference is defined via rate stability (there is no 
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abandonment). The authors in ||5l suggest using the max-weight policy which, much like the FQR-ART 
control, is simple to implement since it uses only information on the current state of the system; it stabilizes 
the system during normal-loads periods; and, finally, it keeps the queues at target ratios when the system 
cannot be stabilized due to high arrival rates. 

It is significant that the max-weight policy is not order optimal for the X model, i.e., the steady state 
queues grow faster than 0{^/n) in the many-server heavy-traffic limiting regime, as shown in |34|. (A 
similar phenomenon was observed in |)25l, when FQR-T is employed with activation thresholds that are too 
small.) The authors in ||34l propose a control which they name shadow routing, and prove diffusion limits 
under a complete-resource-pooling condition for a general parallel service system. Modifications of the 
shadow-routing control, designed for overloaded parallel systems with unknown arrival rates, are studied in 
a subsequent paper [35 ]. Similar to our case, the objective of the control is to quickly and automatically 
detect overloads, and route customers in an "optimal" manner. More precisely, asymptotic optimality of 
their SHADOW-RM control is proved, where the objective is to maximize the reward rate, assuming a 
class-dependent reward of each customer served. 

In ||15J overflow networks in heavy-traffic are studied in settings of co-sourcing, i.e., when firms that 
operate their own in-house call center overflow a nonnegligible proportion of the arrivals to a call center that 
is operated by an outsourcer. Specifically, the in-house service pool has a finite buffer, or no buffer at all, for 
waiting customers (in the latter case, the in-house pool is a loss system), and customers who arrive to find 
the system full are overflowed to the outsourcer. In lITSl . "nonnegligible proportion" of overflow implies that 
the arrival rate to the in-house pool is larger than its maximum service capacity, so that the in-house pool is 
overloaded in the sense that all agents are busy all the time, asymptotically, in the many-server heavy-traffic 
limit. Fluid and diffusion limits are obtained via a stochastic AP, and an asymptotic-independence result, 
which facilitates solving related optimization problems, is shown to hold. 

Healthcare Systems. System Overloads are especially prevalent in healthcare systems, and can even be 
considered a "natural state" in many cases. For example, operation rooms, intensive care units (ICU), MRI 
machines, etc., are designed to operate continuously, 24/7, and exhibit long lines of waiting patients whose 
wait times can be in the order of weeks, months and even years. 

A hospital, as a holistic system, has extremely complex queueing dynamics, having multiple internal 
flows between its units in addition to exogenous arrival streams. Thus, overloads in some units of a hospital 
can "propagate" to other units, creating a system-wide overload. For example, when inpatient wards (IW) 
are overloaded, patients from the emergency department (ED) who need to be hospitalized cannot be trans- 
ferred to the IW due to the unavailability of beds, creating a well-known phenomenon of blocked beds in the 
ED, i.e., beds that are occupied by patients who finished their treatment in the ED; see, e.g., f3l for a current 
review. We further refer to [2 1 for a data-based study of queueing aspects in hospital settings, as well as for 
an extensive literature review. 

In [6 1, a fluid approximation of an ICU experiencing periods of overload periods is studied, in which the 
service rate of current ICU patients increases (is "sped-up") if the number of patients that are waiting to be 
admitted to the ICU exceeds a certain threshold. In turn, the sped-up patients have an increased probability 
of readmission to the ICU, so that alleviating overloads by employing speedup increases future overloads. 
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The fluid model in 161 builds on an AP as an engineering principle, in the spirit of our work ||26l. 

Based on an extensive empirical study of a Singaporean hospital, a complex stochastic network model 
is designed and simulated in fdT], in order to approximate the inpatient operations in that hospital. During 
periods of congestion with large waiting times, ED patients may be sent to a non-designated IW if a wait- 
time threshold is crossed. That threshold changes dynamically with the time of the day. In the stochastic- 
network terminology, that overflow procedure accounts for an activation of a non-basic activity. In our 
model, a non-basic activity relates to the possibility of routing customers from one class to its non-designated 
service pool, and such an activity is active when the appropriate activation threshold is crossed. 

In ll30l a fluid model is used to study the effects that IW discharge timing has on the ED boarding times. 
Based on that model, the authors conclude that shifting the IW discharge time, to take place before the 
ED experiences its peak demand, can eliminate the overload in the ED. This conclusion, however, is not 
supported by the findings of Il33l . 

Congestion Collapse and Oscillations. In our model, the more overloaded the system is, the more agents 
are working with shared customers, so that throughput due to service decreases if the service rates of shared 
customers is lower than that of the designated customers. Moreover, as we showed in ll25l and [26 ], even an 
underloaded system can go through "congestion collapse", creating severe overloads when an inappropriate 
control is employed. Congestion-collapse phenomenons are also observed in |[T9l and ||32l . where increased 
loads may cause decreasing throughput rates, if an inadequate service discipline is employed. 

Another feature of overload controls is that they might cause oscillations during normal loads. See ^ 
below for oscillatory behavior that may occur in our model if FQR-ART is employed inadequately. Other 
Examples of oscillatory behavior appeared in the literature; see, e.g., iflOl . and the example in Figure 2 in 
0. 

There is a large body of literature on overload controls for Stored Program Control (SPC) systems. 
An extended list of references can be found, for example, in ll20l . where three basic overload controls of 
SPC-switches are discussed. It is shown that a certain priority discipline may cause severe oscillations 
in the system, creating large queues even if the system is underloaded. (See Section 4 in that reference.) 
Two important points are made, that are directly related to our work here: First, it is argued that the study of 
time-dependent (non-stationary) behavior of such stochastic systems during overload periods is an important 
research direction. Of course, fluid models should play a significant role in such settings, due to intractability 
of the stochastic transient behavior. Second, that any overload control should ensure that call delays do not 
increase during normal load periods. For example, a control must not generate oscillations in the manner 
described above, or create harmful congestion in the manner described in lf25]| and ||34]| . 

2 The Time- Varying X Model 

We now describe the stochastic model in detail. The X depicted in Figure [T] has two customer classes and 
two agent pools, each with many agents. We assume that each customer class has a service pool primarily 
dedicated to it, but all agents are cross-trained so that they can handle calls from the other class, even though 
they may do so inefficiently, i.e., customers may be served at a slower rate when served in the other class 
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The X Call-Center Model 
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Figure 1 : Ttie X model 



pool. We assume tliat botli customer classes arrive according to independent nonhomogeneous Poisson 
processes witli time-varying rate functions. Tlie staffing levels are assumed to be time dependent as well, 
possibly anticipating the changes in the arrival rates. The arrival rates may be unknown, but we treat them 
as deterministic functions, i.e., we do not enforce any distributional assumptions on these rates. We let fiij 
denote the service rate of class-z customers served in service pool j, i,j = 1, 2, and assume that service 
times are independent exponential random variables. Each class has an infinite buffer where customers who 
are not routed immediately into service upon their arrival are waiting for their turn to be served. Within each 
class, customers are served according to the first-come-first-served discipline. We further assume that each 
customer has a finite patience that is exponentially distributed, and will abandon if his waiting time in queue 
exceeds his (random) patience. We denote by 9i the patience rate of class i, i = 1,2. 

Even though we do not prove any limit theorems here, and instead develop direct fluid models to approx- 
imate the stochastic system, we will use asymptotic considerations in our analysis. We therefore consider a 
sequence of X systems, as just described, indexed by a superscript n. 

Specifically, for each n > 1 we let {A"(t) : t > 0} denote the arrival-rate function to pool i, and 
{m"(t) : t > 0} denote the number of agents in pool j as a function of time, i,j = 1,2. For the fluid 
approximation we assume that 

A"(-)/n — > Ai(-) and m!j{-) /n ^ mj{-) as n — oo, (1) 

where Xi,mj : [0, oo) i— [0,oo), i,j = 1,2. A rigorous treatment (which will not be carried out here) 
requires assumptions on the limit functions in ([T}. A minimal assumption, that is clearly not restrictive in 
applications, is that those limits are differentiable except possibly in a finite number of points. Therefore, 
the arrival rates and staffing functions can have points of discontinuity (have jumps), but the number of such 
points is finite. See Remark |57T] below. 



7 



For all n > 1 let Q^{t) denote the number of customers waiting in the class-i buffer and Zf-{t) be the 
number of class-i customers in service pool j at time t in system n. Define 

X^^X^{t)^{Q^{t),Zl^{t):i,3 = ia). t>0. (2) 

If a control is employed that uses only information on the current state of at each decision epoch, then 
X"' is a nonhomogeneous continuous -time Markov chain (CTMC). 

Since we are interested in fluid analysis, it is reasonable to consider three possible regimes: underloaded, 
normally loaded and overloaded. We let {pf{t) : t > 0} denote the traffic-intensity function to pool i in 
system n, i.e., p^{t) := X^{t)/{fii^im^{t)), and consider the following limits i = I and i = 2. 

lim(p?(f)-l) = ft(t), t>0. (3) 

We say that pool i is underloaded at time t if /3j(t) < 0, overloaded at time t if /3{t) > and normally 
loaded at time t if /3j (t) = 0. 

The interesting, and realistic, scenarios to investigate are when the system switches between different 
loads, e.g., from having both pools normally loaded to an overload case and back to normal, but remains at 
each regime for some non-negligible amount of time. We therefore assume that the system does not switch 
regimes too quickly, e.g., if pool 1 is overloaded at time t (/3i(f) > 0) and class-1 customers receive help 
from pool 2, then this overload incident and sharing take place for a few mean service times. 

As we already mentioned above, it is typical in the call-center literature and in applications, to consider 
piecewise-constant arrival rate functions, and corresponding piecewise-constant staffing functions. We will 
thus consider three types of models: 

(a) Stationary models. The arrival processes are homogeneous Poisson with fixed rates and constant 
staffing functions. We will assume that those functions have changed at time 0, and then consider the 
transient behavior of the system as it approaches a new steady state. 

(b) Piecewise-stationary models. The arrival processes are nonhomogeneous Poisson processes with 
piecewise constant intensity (rate) functions. The staffing functions will also be assumed to be piecewise 
constants, and the system will switch between different loads. 

(c) Time- varying model. The arrival rates and staffing functions are general functions of times, but, 
as mentioned above, we assume that they are continuously differentiable, except possibly on a finite set of 
points. 

Clearly, the time-varying model (c) is the more general and includes the first two types of models. In 
models (b) and (c) the staffing functions may change in accordance to anticipated changes in the arrival 
rates. However, in all three models, the realized arrival rates may be different than anticipated, which can 
cause unexpected overloads. 

2.1 Two Controls: FQR-T and FQR-ART 

We now elaborate on the two controls which were already described briefly in fJT] For each n > 1, the FQR- 
T control is based on two positive (activation) thresholds, 2 ^^id fcg i and the two queue-ratio parameters. 
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ri 2 and r2,i. We define two (centered) queue-difference stocliastic processes 

Dl^it) ^ QUt) - - n^2Q^it) and Dl.it) ^ r^^iQ^t) - - Q^it), t>0. (4) 

As long as D'i2{t) < and i(0 < we consider tiie system to be not overloaded so that no 
customers are routed to be served in the other class' pool. Once one of these inequalities is violated, the 
system is considered to be overloaded, and sharing is initiated. For example, if Df2{t) ^ 0, then class 1 is 
judged to be overloaded (because then Q" — ri^2Q2 ^ ^12)' ^^'^ it is desirable to send class-1 customers to 
be served in pool 2. Note that Df2{t) ^ does not exclude the case that class 2 is also overloaded; we can 
have /?! (t) , /32 (i) > 0. However, once one of the thresholds is crossed, its corresponding class is considered 
to be "more overloaded" than the other class. We refer to this situation as unbalanced overloads. We refer to 
the thresholds A;" 2 and kl^ i as activation thresholds, since the crossing of one of these thresholds activates 
sharing (although they also serve as a mean to prevent sharing when it is not required). 

The behavior of in Q depends on the choice of the thresholds kfj. In particular, we want the 
thresholds to be large enough so that sharing will not take place if both service pools are normally loaded, 
and to be small enough to detect any overload quickly, and start sharing in the correct direction once the 
overload begins. Note that without sharing, the two pools operate as two independent M/M/m" + M 
(Erlang-A) models. The fluid and diffusion limits for the Erlang-A model give insight as to how to choose 
these thresholds; see |fT3l . In our previous papers we assumed that the activation thresholds are chosen to 
satisfy the following asymptotics: 

k^j/n — )- and k'^j/\/n -^00 as n — )- 00, i,j = l, 2. (5) 

The first limit in ensures that overloads are detected quickly (immediately as n — 00) since the fluid 
limit of the queue in a pool that starts normally loaded (with no fluid queue) and becomes overloaded 
will become strictly positive immediately after the overload begins. The second limit in ^ ensures that 
stochastic fluctuations of normally-loaded pools will not cause undesired sharing, since the diffusion-scaled 
queue in that case are of order ^/n. 

In the call-center settings, if sharing of customers takes place during overloads only, it is reasonable to 
assume that agents serve the other class customers (the so-called "shared customers") at a slower rate than 
their own designated customers. Thus, substantial sharing can reduce the effective service rate of the helping 
pool. To avoid sharing in both directions simultaneously, we enforced the one-way sharing rule described in 
f|T] In the time-varying settings considered here, it is clear that the one-way sharing rule may considerably 
slow the response to a change in the direction of overloads. We elaborate in f|3]below. 

To remedy this problem, one can remove the one-way sharing rule altogether and rely solely on the 
activation thresholds to avoid undesired sharing. However, removing that rule can have harmful effects on 
the system. First, there is a need to choose activation thresholds that are higher than if the one-way sharing 
rule is employed, increasing the time until overloads are detected. Moreover, if these thresholds are too large, 
then some overloads may not be detected at all (recall that abandonment keep the queues from increasing 
indefinitely.) Second, if sharing is taking place in one direction, and then immediately starts in the other 
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direction in response to a switch in the overload, then the combined service capacity of both pools may be 
reduced significantly, creating a period of severe congestion in both directions. Hence, it is beneficial to 
avoid too much simultaneous two-way sharing. 

Adjusting the Control: FQR-ART. For the reasons discussed above, we suggest a modification of the 
one-way sharing rule by introducing release thresholds (RT). For each n > 1, we take two strictly positive 
numbers rf 2 and T21 and modify the control in the following manner: A newly available type-2 agent is 
allowed to take a class-1 customer at time t only if the number of type-1 agents serving class-2 customers at 
the same time t is below (and of course D^i{t) > 0), i.e., if ^2 i(*) — '^2 1' ^^'^ similarly in the other 
direction. 

The new release thresholds make activation thresholds satisfying ^ not suitable for the time-varying 
environment, as will be shown in f]4] below. Instead, these activation thresholds must be positive in "fluid 
scale", i.e., they should be chosen so as to satisfy 

lim k^Jn = h j > 0, i, j = 1, 2. (6) 

To summarize, the FQR-ART control is a modification of FQR-T in order to adjust for the time- varying 
environment considered here. For each system n, the FQR-ART control is specified by the six parameters 
(?'i,2; ^'2,1, ^12) ^2 i5 "^""21 '''21) ^'^d routing and scheduling rules which depend on the values of the two 
processes D"^- and Z^-, i ^ j, in the manner described above. We mention that FQR-T requires knowing 
only the values of the queues at each time t (specifically, the values of the two difference processes 
whereas FQR-ART also requires knowledge of Z{*2 and However, under either control, the X model 
is a (possible inhomogeneous) CTMC. 

2.2 Analysis Via Fluid Approximations 

Since the stochastic process X" in ^ under FQR-ART is evidently too difficult to analyze exactly, we 
will employ a deterministic dynamical-system approximation, and refer to that approximation as "fluid 
approximation" or "fluid model" interchangeably. The main idea in using fluid approximations is that, for 
large n, X"- x, for some deterministic function x that is easier to analyze than the untractable stochastic 
process X". (We use the 'bar' notation throughout to denote fluid scaled processes, e.g., X" = X^/n.) In 
particular, the fluid counterpart of X" in ^ is the six-dimensional function 

x = x{t) = {qi{t),z^,J{t) -.ij = 1,2), t>0, (7) 

where qi and Zij are the fluid approximations for the stochastic processes Qf and Zf^, i,j = 1,2. 

The function x should formally arise as a functional strong-law of large numbers (FSLLN), and thus 
capture the time-dependent mean behavior of the process X". 

Note that, in the stochastic system, routing of customers among the two pools depends on the values 
of the difference processes in dD. For example, if sharing is taking place with pool 2 helping class 1, and 
assuming < T21, the process D^2 determines which customer class a newly available type-2 agent will 
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take. Unfortunately, in the fluid system we cannot simply replace the process with a process 

di,2ii) = qi{t) - h^2 - ri,2q2{t), t > 0. 

In fact, the purpose of the control is to keep di,2(0 = during the overload. Hence, to determine the 
evolution of the fluid model a refined analysis of the behavior of 2 required (or D2 i during overloads 
in the other direction). We elaborate in f|5]below, where the fluid equations are developed. 



3 Relaxing the One- Way Sharing Rule by Adding Release Thresholds 

Relying on fluid considerations, we now demonstrate why the one-way sharing rule is not adequate to the 
current time- varying settings. As mentioned in ^2.11 above, the system can be slow to respond to switch- 
ing overloads if the one-way sharing is employed. The simple fluid analysis suggests that having release 
thresholds eliminates fixes that problem. 

3.1 Drawbacks of One- Way Sharing in Time- Varying Settings 

We consider two consecutive time intervals Iq = [to, ti) and Ii = [ii, with < < *i < *2 < cc, with 
the system being overloaded in opposite direction over each interval. Suppose, for example that over the 
time interval Iq class 2 was overloaded and sharing took place with pool 1 helping class 2. Then, at time ti 
the loads have changed with class 1 becoming overloaded in such a way that sharing is required in the other 
direction. In particular, we assume that /3i(t) > for t € /i and that 2:2,1 (ti) > 0. 

Recalling that the fluid approximation has asymptotic significance, namely, that 2:2,1(0) > implies 
that ^2 1(0) is proportional to n (and n is large), the mean time to wait until pool 1 has no more class-2 
customers is, for large n, 

1 ,_, log(^g,i(0)) ^ 

jr[ j ■ /^2,i /i2,i 

where log(-) is the natural logarithm, i.e., log(-) = logg(-). We thus see that the time it takes a pool to empty 
from its shared customers is of order log(n) if n is large, when no new shared customers are routed to that 
pool. 

A fluid approximation for the evolution of can easily be derived using rate considerations. Since 
every type-1 agent who is helping a class-2 customer at time t > ti will finish service immediately after 
time i at a rate /X2,i, regardless of the value of t, due to the memoryless property, and since there are no more 
class 2 customers routed to pool 1 after time ti, we expect that 2:2,1 will satisfy the ODE 

Z2,lit) = -//2,l2:2,i(t), t G /i, 

whose unique solution is 

Z2,l{t) = Z2,i{h)e-''^'^\ tG[ti,t2). (9) 
In particular, in the fluid model, if 2:2,1(^1) > 0, then pool 1 will never empty, so that sharing can never 
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begin in the opposite direction. 

3.2 Adding the Release Thresholds 

The simple considerations leading to ^ and ^ show that a large system will be slow to react to changes in 
the direction of overloads, and will require an order of log(n) time units to switch the direction of sharing if 
the one-way sharing rule is employed. 

However, the fluid model Q suggests that, if 2:2,1(^1) > ^"2,1 and r2,i > 0, then 2:2,1 will reach r2,i in 
finite time. In particular, r2,i will be hit at time 

fJ'2,1 V T2,l J 

This observation leads to the need of having release thresholds which are chosen as follows: For two strictly 
positive numbers ri,2 and r2,i we take two sequences {rfg} and {rl^i}, satisfying 

Ti2/n — )• ri,2 and r^i/n — ?• T2,i as n — )• 00. (10) 

Then, an available type-2 agent is allowed to serve a class- 1 customer only if the proportion of type-1 agents 
serving class-2 customers is below T21 (and of course D^^{t) > 0), and similarly in the other direction. 

3.3 Simulation Experiments 

To illustrate the importance of the release thresholds for (finite) stochastic systems, we compared the per- 
formance of a system with and without release thresholds in simulation. The results can be seen in Figures 
|2]and|3] The parameters for this simulation are 

= m^ = 1000, = 1200, = 990, ^1,1 = ^2,2 = 1, f^i,2 = ^2,1 = 0.5, 
«^i,2 = «^2,i = 100, and r = 1. 

(Here, we can think of n as being fixed and equal to 1000.) With these parameters, queue 1 is overloaded, 
while queue 2 is underloaded. To respond to that unbalanced overload by having pool 2 help class 1, we 
should have > and Z21 = if one-way sharing is employed. However, we initialize the system with 
sharing in the opposite way. We are interested in the time it takes the stochastic process Z'21 to reach 0, so 
that the desired sharing can begin. With release thresholds of only rf 2 = T21 = O.Oln = 10, that time is 
reduced from about 21 mean service times to about 9 service times. Thus, clearing the last 1% of the class-2 
customers in pool 1 without release thresholds takes more than half the total clearing time. 

Indeed, we consider an extreme example in which all of service pool 1 is initially busy with customers 
from class 2, and none of the type-2 agents are busy serving class- 1 customers. We do this in order to convey 
the message that it is the last few agents working with class 1 that cause the delayed response. In particular, 
the Z21 process decreases fast at the beginning, but then the decrease rate slows down considerably. 

From Figures |2] and |3] it is also easy to see what happens in less extreme cases, when < Z2,i(0) < 
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mi. For example, if we initialize with 20% sharing in the wrong direction, we see that, without a release 
threshold, the time to activate sharing in the right direction is about 21 — 4 = 17 time units. In contrast, 
with release thresholds, it is about 9 — 4 = 5 time units. When we start with a lower percentage of agents 
sharing the wrong way, the difference becomes even more dramatic, because we eliminate a common initial 
period (here of length 4 time units). 



Proprtions Serving Others In Both Pools 



Proprtions Serving Others In Both Pools 




10 15 20 25 30 35 



1 5 20 2S iO 35 



Figure 2: Sample paths of Zi^2{t) and 
-^2,1 (i) initialized incorrectly, without re- 
lease thresholds. 



Figure 3: Sample paths of ^1,2 (i) and ^2,1 (i) 
initialized incorrectly, with release thresholds 

= T2A = 0.01. 



4 Congestion Collapse Due to Oscillations 

The simple Fluid consideration in ^3.11 and its application to the stochastic system, demonstrates the need 
for having release thresholds when the direction of overload switches. However, an overload period can also 
end with a return to normal loads, so that no sharing (in either direction) is required. We now show that the 
release thresholds can cause serious problems when the system returns to normal loading after an overload 
incident if the activation thresholds are chosen according to (O, as in the FQR-T control. The main problem 
that can arise in these setting is when the inefficient sharing condition holds, namely when 

A*i,i > /^2,i and ;U2,2 > Aii,2 (11) 

which is what we now assume. 

Let time be the time of that an overload period ends. For simplicity, we assume that both the arrival 
rates and staffing levels remain constat for alH > 0. For our purpose here, we further assume that 

Ai = /ii,imi and A2 = /i2,2"i2, (12) 

so that both pools would be normally loaded if there was no sharing. However, since we consider a system 
that is recovering from an overload, we assume that ^2,i(0) > T2,i (implying that 2:2,1 (i) > for all t > 
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as well, by Now, for each time t > 0, the total effective service rate of pool 1 is 



/"l,l("T'l - Z2,l{t)) - /X2,l2:2,l(i) = Ai - (/Ui_i - /i2,l)^2,l(*) < -^1 



where the equality follows from ([T2] | and the strict inequality follows from (ITTI ) and the fact that 2:2,1(0 > 
for all t > 0. That implies that pool 1 is overloaded for all t > 0, even though it would have been normally 
loaded if there were no class-2 customers in the pool. 

Let to = inf{i > : 22,1 (t) = T2,i}. Since pool 1 is overloaded and no class-1 customers are sent to 
pool 2 on [0, to), it holds that qi{t) > for all t G (0, to]. If the number of class-1 customers in pool 2 is 
negligible initially, i.e., 21,2(0) = 0, then (72 will approach 0. Let ti = mi{t > : qi{t) — ri 2(?2(i) = ^1,2}- 
If the activation thresholds are chosen according to dD, so that /ci,2 = 0, then class-1 customers will be sent 
to pool 2 at time t2 = max{to, ti}, i.e., 21,2(0 > for * ^ By the same reasoning given above, pool 
2 is then overloaded for all t > t2, so that both service pools are overloaded for all t > t2- 

This problem may seem marginal for finite systems if we choose Tij to be sufficiently small (which we 
do). However, Tij small only ensures that one of the pools is not "too overloaded" at any time, e.g., if at 
a time t > t2, 21^2(0 < ''"i,2> then we still might have 22,1 (t) relatively large, so that the effective service 
capacity of pool 1 is substantially reduced for a long time interval. The simple analysis above suggests 
that the system may start oscillating with Zij growing large and then decreasing over and over again. Such 
oscillatory behavior may cause congestion collapse due to the effect it has on the long-run average service 
rate, namely, for some nonnegligible Cij > Tij, it may hold that 



In particular, the long-run average service rate of the two pools combined can potentially be much lower 
than Ai + A2, making the system severely congested. 

4.1 Simulations of Oscillating Systems 

The oscillatory behavior of the system is hard to observe when there is abandonment. Thus, for display 
purposes, we start with an extreme case of a system with no abandonment, and then simulate a more realistic 
system with abandonment. Figures |4] and |5] show a single simulated sample path of a system case with 
^1,2 = /^2,i = 0.1 and no abandonment. The service rates for the designated classes are /^i 1 = 112^2 = 1> 
there are 100 agents in each pool, and the arrival rates are Ai = A2 = 98, so that both pools are stable if 
there is no sharing. The two types of thresholds are taken to be k^j = 10 and t^- = 1, i,j = 1,2. The 
symmetry of the system implies that both pools and queues exhibit the same behavior. 

A more realistic example is shown in Figures |6]and|7] where ^12 = ;U2,i = 0.5 and the abandonment 
rates are 9i = 62 = 0.5. The other parameters are the same as in the previous extreme case. The oscillations 
of the service process are harder to observe, so we compare to a simulation of the exact same system, but 
with the activation thresholds increased to A;"^ = 35, shown in Figures |6] and |7] 

Since the activation thresholds in FQR-ART should be of order n asymptotically, as in Q, we can use 
simple fluid considerations to compute good values for these thresholds in the fluid approximation, i.e., to 



liminf— / Zij(s)ds = Cij i,j = 1,2. 
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Figure 8: Z\^2 with larger thresholds: fcj = 35. Figure 9: Q2 with larger thresholds: ki^j = 35. 

find candidates for fci 2 and A;2,i- (Engineering considerations are needed.) An example on how to choose 
the thresholds is given in the appendix. 
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5 The Fluid Model 



The fluid model for the stochastic system X" under FQR-ART is described implicitly as the solution to an 
ODE. In this section we derive that ODE via a heuristic representation of the inhomogeneous CTMC in ([2]l. 

5.1 Representation of the Stochastic System During Overloads 

The sample paths of a queueing system are represented in terms of its primitive processes, i.e., the arrival, 
abandonment and service processes, as a function of the control. Unlike traditional fluid models, in which 
the primitive stochastic processes are replaced by their long-run rates, the deterministic fluid model here 
is more involved and includes a stochastic ingredient in the form of a stochastic AP, which we describe in 
detail in ^5.2| below. 

Even though we are not proving that the fluid model arises as a weak limit of the fluid-scaled stochastic 
system, we need to take asymptotic considerations in order to develop the fluid approximation. We thus start 
with a representation of the stochastic system during overloads, assuming that both service pools are full 
over an interval [0, T], namely that 

Z^At) + Zl,{t)=mUt) and Z^^it) + Z^^it) = m^{t), t e [0,T]. (13) 

We next use random time-changes of independent unit-rate Poisson processes to represent the sample 
paths of X", as reviewed in Il24l : see Equations (41)-(43) in pSi for such a representation applied to the X 
model operating under FQR-T. For example, the representation of Q" over [0, T] is 




where N^, Nf, and are mutually independent unit rate (homogeneous) Poisson processes, and Ia 
is the indicator function that is equal to 1 if even A occurs. 

Note that the representation of Q" is essentially a flow equation (based on the memoryless property 
of the exponential distribution). That is, the queue at time t is all those customers who arrived by that 
time, captured by the Poisson process Nf, minus all the customers that abandoned, captured by the Poisson 
process Nf-, minus all those who were routed into service, as captured by the last two Poisson processes in 
the expression. 

We elaborate on how the intensities of the last two Poisson processes in the right-hand side (RHS) of 
the representation were obtained. First, if at time s € [0, T] the event 'Di^2{s) = {^i 2(^) > n i(s) < 
T21} holds, then any newly available agent in the system will take his next customer from the head of 
queue 1. Since agents become available at an instantaneous rate J2i j f^i,jZf'j{s) at time s, we get the third 
component in the RHS of Qi{t). Next we recaU that, by the routing rule of FQR-ART, if at a time s G [0, T] 
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P2,i('S) = {{^21(3) > 0)n(Z"2(s) < rf 2)} holds, then any newly available agent takes his next customer 
from queue 2, in which case queue 1 will not decrease due to a service completion. If neither of the events 
^1,2(5) or 1^2,1 (s) holds at a time s, then only service completions at pool 1 will cause a decrease at queue 
1 due to a customer from that queue being routed to service. That explains the last term in the RHS of the 
representation. 

Next, we exploit the fact that each of the Poisson processes in the representation minus its random 
intensity constitutes a martingale (again, see |[24l '). e.g.. 



is a martingale. Thus, subtracting and then adding all the random intensities, and using the fact that a sum of 
martingales is again a martingale, we get the following representation for the processes Q", Q2, -Z^f 2) ^21 
(the remaining two processes Z^-^ and Z22 are determined by ([T3Tl). 



QUt) = M^{t) + / XUs)ds - / eiQUs)ds 
Jo Jo 

- / l{{Z)72(s)>0}n{Z£^(s)<r2"_j}} (/^l,l^r;i(s) + /^l,2^r;2(^) + /^2,1^2,l(s) + /^2,2^2,2 (^)) ds 
•J 

- / (1 - l{{D5^2(s)>o}n{Z2"i(s)<T^' J} - l{{D^_i(s)>o}n{Zi"2('')<r{^2}}) + /^2,i^2,i(s)) ds, 

J 

Q2 (i) = M-^it) + r A5(s)ds - r e2Q^(s)(is 

JO JO 

l{{DJ_i(s)>0}n{Zi"_2(s)<T;'2}} (A*l,l^r;i('5) + /^1,2^",2('S) + /^2,1^2,l('S) + /"2,2^2,2(5)) ^^'S 

t 

(1 - l{{D5^2W>0}n{^2"iW^'^2;i}} ~ ^{{^?,i(*)>0}n{^r,2W<^"2}}) (^2,2^2,2('5) + /^l,2^r,2('5)) ^S, 
^"2(0 = ^l"2(0 + / l{{OT2(s)>0}n{Z2"i(5)<T2"i}}^2,2-^£2('5)c^S 

JO ■ ' ' 

(1 - l{{D5^,2W>o}n{^2"iW^'^£i}}'*^i'2^"2(s)^^'S, 

Z£i(i) = M2"i(i) + / l{{D"2W>o}n{Z2"i('5)<T-2\}}^i.i'^"i('*)^* 
JO ' ' ' 

- / (1 - l{{D2"i(^)>o}n{zr.2(^)<^i"2}})^2"i(^))^^' 

JO 



rt 



(14) 



where M", M^, M"2 and M^*]^ are the martingale terms alluded to above. It is not hard to show that those 
martingales are negligible in the fluid scaling, i.e., that M" =^ and Mf- =^ as n — )• 00, uniformly 
over [0, T], z, j = 1, 2; see, e.g.. Lemma 6.1 in |[28l . Hence, we consider those martingales as a negligible 
stochastic noise that can be ignored for the purpose of developing the fluid approximation for ([141 ). 

To replace the stochastic integral representation in (IT4l with a deterministic one, we need to replace the 
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indicator functions with smooth functions. We start by assuming that there is a fluid counterpart x for X" 
in (fT4l ) which is continuous and dijferentiable. (This fact can be shown to hold by a minor modification of 
Corollary 5.1 in 1*281). For any fluid point x{t), let 

di,2{x{t)) = qi{t) - ri^2q2{t) - ki^2 and d2,i{x{t)) = r2,iq2(t) - qi{t) - k2,i. (15) 

We first observe that, if dij{x{t)) > then, since dij{-) is a continuous function, j is strictly positive 
over an interval, and similarly if dij < 0, i, j = 1, 2. In such cases the indicator functions are easy to deal 
with because each is a constant over the interval, and equals either 1 or 0. For example, if di_2(x(t)) > 
for t S [si, S2), for some < si < S2 < 00, and in addition, .^2 i(0 ^ '^21 o^^^" "^^at interval for all n large 
enough, then 

i{{Di^{t)>o}n{zi^{t)<T^J} = 1{[S2,S2)} for all n large enough. 
Hence, a careful study is required for all x{t) =7 in the boundary sets defined by 

Bi,2 = {7 e Me : (^1,2(7) = 0} and B2,i = {7 G 1^6 : ^^2,1(7) = 0} (16) 

Note that FQR-ART aims to "pull" the fluid model to one of these two boundary sets during overloads, when 
sharing is actively taking place, i.e., Bj j is the region of the state space where we aim the fluid model to be 
when pool j helps class i,i,j = 1, 2. 

Unfortunately, there is no straightforward fluid counterpart to the stochastic processes and L>2 1 
when the fluid is in the boundary sets. However, there are two related stochastic processes, operating in an 
infinitely faster time scale, whose behavior determines the evolution of the fluid model, as we now explain. 

5.2 A Stochastic Averaging Principle 

Before we explain how to deal with the indicator functions in the representation ([T4] |. we emphasize that the 
following explanation is for the purpose of gaining insight only. The explanation draws on results in (28], 
which were proved in different settings than here. 

Assume, for example, that x{t) G Bi 2 and consider -D" 2- To be able to apply the results in ||28l . we 
assume (for now) that the arrival rates are fixed (the arrival processes are homogeneous Poisson processes) 
and that Z21 < T2,i, so that routing is determined solely on the value of £'12- In particular, sharing can take 
place if -Di,2(0 > 0- Then, by Theorem 4.5 in 1,28], 

Dl2{t) ^ Di^2{x{t),oo) in M as n ^ 00, (17) 

where 1)1,2(75 •) = {^1,2(7, s) : s > 0} is a CTMC associated with 7 G Mg whose distribution is deter- 
mined by the value 7. (There is a different process for each 7.) 

An analogous result holds for 1 when x{t) G B2,i. The notation Djj(7,oo) stands for a random 
variable that has the steady-state distribution of the CTMC Dij{'y, •). Loosely speaking, Wl^j moves so fast 
when x{t) is in Bjj, that it reaches its steady state instantaneously as n — )• (X). Hence, we call Dij{j, •) the 
fast-time-scale process (FTSP) associated with the point 7, or simply the FTSP. 
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Since we are interested in analyzing the indicator functions in (l20l ). we first define 



A,j(7r) = +00 if dij{j)>0 and Dij{-f ,■) = -00 if ^^^(7) < 0, 7 e Mg. 
Next, we define 

7ri,2(7) = -P(-C>i,2(7,oo) > 0), for 7 G Bi,2 and 

(18) 

7r2,i(7) = -P(-C>2,i(7,oo) > 0), for 7 G B2,i. 

Now, by Theorem 4.1 in ll28l . which was proved for the process D^2 when x G Bl^2^ and assuming that 
^2 i(s) ^ ^2 1 ovsr [^1)^2] for all n large enough, we have that 



/ 

Ju 



t2 rt2 
^{{■D5^2W>o}n{^Ji(s)<T2"J}'^s =^ / vri,2(x(s))(is. 



Similarly, if x G B2,i over an interval [^3, t4], and 2(5) < rf 2 for all n large enough over that interval, 
we have 

/ l{{DJi(s)>o}n{Zf2{s)<ri"2}}C^s / 7ri,2(2;(s))(is. 

Jt-i ' • ' Jt-i 

The convergence in both equations above holds uniformly. 

We named this latter result a "stochastic averaging principle", or simply an averaging principle (AP), 
since the process Dfj (t) is replaced by the long-run average behavior of the corresponding FTSP Dij (x(t) , ■] 
for each time t over the appropriate interval. 

In the FQR-ART settings, the AP holds under the assumption that Zf-j lies below the appropriate release 
threshold over the interval [ti, t2] for all n large enough (i.e., with probability converging to 1 as n — )• 00). 
If is above the appropriate release threshold for all n large enough (again, with probability converging 
to 1) over [ti,t2]> then the limit of the integral considered above is clearly the function. It remains to 
rigorously prove convergence theorems at points at which Z'^-{t) = r"^ + op{n), where op{n) denotes a 
random variable satisfying op{n)/n as n — )■ 00. However, it is not hard to guess the dynamics of the 
limit (if it exists) at such points, as we do in our fluid approximation below. 



5.3 Representation via an ODE 

The above limiting arguments lead to the following fluid approximation for the X system under FQR-ART 
during overload periods. Considering an interval [0, T] for which 

zi,i{t) + Z2,i{t) = mi{t) and Z2,2(t) + Z2,2(0 = rn2{t) for all t G [0, T], (19) 

together with an initial condition x{0), the fluid model of X" is the solution x = {x{t) : t > 0} over [0, T] 
to the ODE: 
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Ql{t) = - 9iqi{t) - Ui^2ixit)) + /Xl,22l,2(i) + fJ'2,lZ2,l{t) + l^2,2Z2,2{t)) 

- (1 - ni,2(2;(t)) - U2,l{xit))) + fl2,lZ2,l{t)) , 

92(0 = A2(t) - 6'2g'2(i) - n2,l(x(t)) + /il,2Zl,2(i) + Ai2,lZ2,l(t) + lJ'2,2Z2,2{t)) 

- (1 - ni,2(a;(t)) - n2,l(a;(t))) {^l2,2Z2,2{t) + m,22l,2(t)) , 
il,2(i) = ni,2(x(t))/i2,2^2,2(t) - (1 - ni,2(2;(t)))/ii,22l,2(t), 

i2,i(i) = n2,i(x(t))/ii,izi,i(t) - (1 - n2,i(a;(t)))/i2,iZ2,i(t), 

m2{t) = i2,2(i) + Z2,i{t). 
where, for TTij{x{t)) in ( fTSl) . j = 1, 2, 



Note that the ODE (l20l) can be equivalently represented by an integral equation resembling ([T4] |. but with 
the negligible martingale terms omitted, all the stochastic processes replaced by their fluid counterparts, and 
the indicator functions replaced by the appropriate Ilj functions. 

5.4 The Fluid Model When There is No Active Sharing 

The ODE for the fluid model above was developed for all cases for which both pools are full, i.e., ([T9] l holds. 
This is the main case because systems are typically designed to operate with very little extra service capacity 
(if any), and is clearly significant when overloads occur. In particular, note the a normally-loaded system, 
with /?! (t) = (32{t) = 0, will have the two pools full, at least after some short time period. Since the system 
may go through periods in which at least one of the pools is underloaded, we now briefly describe the fluid 
models for underloaded pools. 

Consider an interval / C [0, oo). If no sharing takes place and ^i,2(i) = Z2,i{t) = for all t ^ I, then 
the two classes operate as two independent Erlang-A models over that interval /, to which fluid limits are 
easy to establish. Specifically, assuming without loss of generality, that I = [0, s) for some < s < oo, the 
fluid dynamics of both classes obey the ODE 



In the time-invariant case, the unique solution for a given initial condition to the ODE in (|2TI ) is easily seen 




7rij(x(t)) if Zj^i{t) < Tj^i, 
otherwise. 



f ifq^{t)>0, 
[ Aj(t)l{2, .(4)<m,(i)} - ifqi{t)=0. 



(21) 
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to be 



mi,i 

^ + (^m(O) - ^) e"^M* if,,(t) = 0. 



where aVb = max{a, 6} and ((7i(0), 52(0), zi^i(O), 22,2(0)) is a deterministic vector in [0, cxd)^ x [0, mi] x 
[0,m2]. 

If 2i,2('So) > (or 2:2,i('So) > 0) for some sq > and there is no active sharing over the interval [sq, si), 
then zi 2 (22,1) is strictly decreasing over that interval. Then Zij, i / j, satisfies the ODE 

Zij{t) = -fIijZij{t), So<t<Sl 



which is the same as the ODE for Zij in (1201 ) with Uij = 0. 

Remark 5.1. A proof of existence of a unique solution to the ODE (l20l ) requires showing that the RHS is 
a local Lipschitz continuous function of x and is piecewise continuous in t. We do not prove such a result 
here, but it is important to consider arrival rates and staffing functions that ensure that the RHS of the ODE 
satisfies the piecewise continuity condition in the time argument. 



6 Solving the ODE 

To compute the solution to ( [201 ) requires a computation of tti 2(x(t)) and 7r2,i(x(t)) for all x{t) E Mg- 
Simplification is achieved when ri 2 = ^2,1 = 1, because the FTSP's Dij{x{t), •), i,j = 1,2, become 
simple birth-and-death (BD) processes. To facilitate the discussion we thus consider this simpler case and 
refer to §6.2 in [21 \ for the treatment of the FTSP Di^2 as a quasi-birth-and-death process (QBD) when 
the ratio parameters are not equal to 1. (In ITTl FQR-T is studied with one overload incident, with pool 1 
receiving help, but the same method can be applied to 1)2,1 with sharing in the opposite direction.) 

For simphcity, we again start by assuming that the arrival processes are homogenous Poisson processes, 
having constant arrival rates Ai and A2 over [0, T], and that the staffing functions are also fixed over that time 
interval at mi and m2. Recall that Dij^-y, •) = 00 if dij{'y) > and Dij{y, ■) = —00 if dij{'y) < 0, and 
let Ai_2 and A2,i be the subsets of Mg in which the FTSP's I?i,2(7i •) and -D2,i(7i •) are positive recurrent, 
i.e., 

Ai 2 = {7 G ]Bi,2 : < 7ri,2(7) < 1} and A2,i = {7 G 182,1 : < 7r2,i(7) < !}• (23) 

By definition, if the fluid model at time t is in Aij, i.e., x{t) G Aj j, then dij{x{t)) = 0. However, 
if dij{x{t)) = 0, then x{t) is not necessarily in Ajj, because the FTSP Dij{x{t), ■) may be transient 
(drift to +00 or —00) or null recurrent; in particular. The evolution of the fluid model is determined by the 
distributional characteristics of the FTSP's Di 2 and -D2,i- Hence, even before we try to compute iTij{x{t)), 
which is necessary in order to solve the ODE (|20] |. there is a need to determine whether x{t) is in one of the 
sets Ai_2 or A2,i. We focus on 1)1,2, with the analysis of D2,i being similar. 
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To determined the behavior of the FTSP Di^2 it is again helpful to think of x as a fluid limit of the fluid- 
scaled sequence {X" : n > 1} and to recall that Di 2 was achieved as a limit of Z)"2 without any scaling; 
see ( fTTl) . (See also Theorem 4.4 in 1281 which provides a process-level limit relating D12 and Hence, 
both processes are defined on the same state space, which, for ri^2 = 1, is Z = {. . . , —1, 0, 1, . . . }. 

Now, for a fixed x{t), when Di_2(x(t), •) = m > 0, the birth and death rates of the FTSP are, respec- 
tively, 

A+(x(t),m) = Ai(t) + 02^2(0, 

fi'^{x{t),m) = A2(t) + + m,2^i,2(i) + ^2,1^2,l(*) + ^2,2^2,2(i) + OiQiit). 

In analogy to the (non-Markov) process = Qi ~ Q2 ~ ^12' corresponds to an increase 

of Di^2 due to arrival to queue 1 plus an abandonment from queue 2 (since either one of these two events 
cause an increase by 1 of D^2 stochastic system). Since any other event causes to decrease by 

1, due to the scheduling rules of FQR-ART, we get the expression for m). 
Next, if Di^2{x{t),m) = m < 0, the birth and death rates are, respectively, 

\~{x{t),m) = Ai + H2,2Z2,2{t) + Hl,2Zl,2{t) + 62q2{t), 

fj.^{x{t),m) = A2 + /ii,i2;i,i(t) + H2,iZ2,iit) + 9iqi{t). 

Again, whenever 0^2 is non-positive and sharing is taking place with pool 2 helping class 1, a "birth" 
occurs if there is an arrival to queue 1 or an abandonment from queue 2, or if there is a service completion in 
pool 2 (since then a newly available type-2 agent takes his next customer from queue 2). Similarly, a "death" 
occurs if there is an arrival to class 2, an abandonment from queue 1, or a service completion in pool 1. 

We see that the FTSP Di^2{x{t), •) is a two-sided M/M/l queue, i.e., it behaves like an M/M/l queue 
with "arrival rate" A+(x(t), m) and "service rate" fi^ {x{t),m) for all m > 0, and behaves like a different 
M/M/l queue with "arrival rate" fx^ {x{t),in) and "service rate" A^(x(t), m), for all m < 0. Thus, for 

5+(^)^A+(7,-)-/^+(7,-) and ^-(7) = A-(7, •) - ^'(t, •), 1^^1,2, 
the set Ai 2 can be characterized via 

Ai,2 = {7 G Ki,2 : 5+(7) < < 5- (7)}. 

Next, letting r+(7) and T^{^) denote, respectively, the busy period of the M/M/l in the positive region 
and the busy period of the M/M/l in the negative region, and using simple alternating renewal arguments 
for the renewal process 1)1^2(7, •)> we have 

E[r+(7)] 
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where, from basic M/M/1 theory, 



1 



±(7)-A±(7)- 



Note that if ^1,2(7) = but 7 ^ Ai 2, then vri 2(7) is equal to either 1 or 0. In particular. 



if (5+ (7) > 0, then 7ri,2(7) = 1 and if 5 (7) < then 7ri,2(7) = 0- 



(25) 



There are no other options, since for any 7 = x{t) for which both pools are full (as is required for the ODE 
(I20I 1 to be valid), it holds that 



where the inequality above follows from the fact that 2^1,2 (0 + ^2,2(0 = ^2(i) > 0. 

We see that the sets Aj j and the computation of vTj ,, (•) are completely determined by the staffing, arrival 
rates, service and abandonment rates for any given point 7 G Mg, where the only points that require careful 
analysis are those in one of the two sets Bj However, if the arrival rates or the staffing functions are 
time dependent, then the distribution of the FTSP Dij{x{t), •) is also time dependent. In particular, given a 
7 G Mg we cannot determine whether -01,2(7, ") positive recurrent or not, since that may depend on the 
time t E [0, T]. Thus, the sets at which the FTSP's are ergodic are themselves time dependent, and we need 
to consider sets of the form {Ajj (t) : t G [0, T]}, where 



where <5"'"(7, t) and 6~{'-f, t) are the drifts of the FTSP -Di,2(7; •) at the point 7 at time t. 

Fortunately, for the purpose of solving the ODE, we do not actually need to characterize the sets 
{Ajj(t) : t G [0,T]}, because we can determine whether Dij{x{t), •) is ergodic at each time t as we 
solve the ODE. 

6.1 A Numerical Algorithm to Solve the ODE 

We can use the analysis in ^to numerically solve the ODE (l20l ). starting at a given initial condition x(0), 
since we can now determine the value of I[i j{x{t)) for each t > 0. For example, if at a time t > 
di,2{x{t)) = 0, then we check whether (|26l ) holds, so that x{t) € Ai,2(t). If -22,1 (*) < T2,i, then 
^i,2{x{t)) = iTi^2{x{t)) and it can be computed using (l24l) . If -Z2,i(i) > T2,i, then 112, = 0. If 
di,2{x{t)) = but x{t) ^ Ai,2(t), i.e., if (l26l ) does not hold, then we can determine the value of 7ri,2(x(t)), 
and thus of ni,2(x(t)), by computing the drifts of the FTSP and employing (1251 ) (replacing the drifts in 
( |25] ) with the time dependent drifts as in (|26l)). Similarly we can compute the value of Il2,i{x{t)) whenever 
d2,i{^{t)) = 0. 

In all other regions of the state space for which both pools are full, i.e., Zij{t) + Zj,i{t) = mj{t), i / j, 
we can easily determine the value of TTi^2{x{t)) by considering whether di,j{x{t)) is bigger or smaller than 0. 



6-{x{t)) - 6+{x{t)) = 2(;Ui,2^1,2(t) + /i2,2^2,2(i)) > 0, 



(26) 



23 



For example, if at time t > di^2{x{t)) > 0,themTi^2{x{t)) = 1 and if fii^2(a;(t)) < 0, then 7ri_2(x(t)) = 0. 
This, together with the value of Z2,i{t), immediately gives the value of Hi 2(x(t)). 

We need to use other fluid equations when at least one of the two pools is not full. If, for example 

zi,i{t) + Z2,i{t) < mi{t), then necessarily qi{t) = < ki^2, so that 

zi,2{t) = -/ii,22;i,2(i) and = Ai - ^i,i2;i,i(t). 

The evolution of Z2,i in this case is determined by whether q2{t) < /c2,i or q2{t) > A;2,i. In the first case 
^2,1 (t) must be strictly decreasing at time t if it is positive, or remain at otherwise. In the latter case, when 
Q2it) > A;2,i, the excess fluid - that is not routed to pool 2 and does not abandon, if such excess fluid exists - 
is flowing to pool 1. We thus have 

i2i{t) = i -"'''''''^'^ if<Z2(t)<^i ^27) 

[ -/i2,1^2,l(0 + (^^2 - ^2,2^2,2(0 " ^,2^1,2(0 " 6*2^1) if q2{t) = /C2,l 

Similar reasonings lead to the fluid model of 2 when pool 1 is full, but pool 2 has spare capacity. 
If both pools have spare capacity at time t, then qi{t) = q2{t) and 

= -fJ'i,jZi,j{t) and Zi^i{t) = Xi - iii,iZi^i{t), ij = 1, 2, i / j. 

To compute the solution x over an interval [0, T] we employ the classical Euler method. Given a step 
size h and the time T, the number of iterations needed is = T/h. Let x = ^{x), where is the 
RHS of the appropriate ODE, e.g., if both pools are full, then ^'(2;) is the RHS of ( [20l) . Given x{0), we can 
compute x{h) using the first Euler step: x{h) = x{0) + h^{x{0)). Given x{h) we can compute Ili^2{x{h)) 
and Il2,i{x{h)), if needed, and then compute x{2h) using the second Euler step. In general, the solution to 
the ODE is computed via 

x{{k + l)h) = x{kh) + h^i{x{kh)), 0<k<N, 

where at each step, if x{kh) £ Bi 2 or x{kh) G 182,1, we can compute Iii^2{kh) and Xl2,i{kh) as explained 
above. 

The algorithm just described remains unchanged when the ratio parameters are general (not equal to 1), 
except that the sets Aj and the computations of vTj j are more complicated (the FTSP's are no longer BD 
processes). We refer to lITTI for these more complicated settings. 

Remark 6.1. If at iteration A; > the solution lies outside the set Bi,2 U ©2,1 then, due to the discreetness of 
the algorithm, there is a need to ensure that the boundary is not missed in the following iterations. Hence, if 
in the A;*'' iteration di,2(x(A;/i)) > 0(< 0) and in the (A;+l)'** iteration di,2((x(A;+l)/i)) < (> 0), then the 
boundary di^2 necessarily was missed, because the fluid is continuous, and so we set di^2{{x{k + 1)^)) = 0. 
We then check whether x{{k+l)h) G Ai,2((A;+l)/i), compute 7ri,2(x(/c + l)/i) and use its value to compute 
the value in the {k + 2)"*^ iteration. It is significant that we do not force the solution to be on the boundary, 
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e.g., we do not compute qi{{k + l)h) and use its value to compute q2{{k + l)h) via 

q2i{k + l)h) = q,{{k + l)h) - h,2. (28) 

We solve the six-dimensional ODE in (l20l ). and if indeed (l28l) holds whenever it should, then we have a 
good indication that the algorithm works. That is, we can check at which iteration the boundary Bi 2 was 
hit, and then observe if qi{t) — q2{t) = A;i 2 over an interval for which we have indication that this should 
hold. (Of course, the solution to the algorithm might leave the boundary for legitimate reasons, i.e., because 
the fluid model leaves it.) 

7 Numerical Examples 

We now study three examples. The first two are piecewise continuous models, whereas the third is for 
a general time-varying model. In all three examples the system starts empty, so that we also check the 
numerical algorithm in periods when (1T9] | does not hold, as in ^5.41 

We compare the numerical solutions to the ODE to simulations, to see how well the fluid approximated 
stochastic systems. In the first two examples we simulate three systems, each can be considered as a com- 
ponent in a sequence {X^ : n > 1}. In the smallest system we take 50 agents in each service pool, in the 
middle one there are 100 agents in a pool, and the largest has 400 agents in each pool, i.e., we simulate 
for n = 50, 100, 400. That allows us to observe the "convergence" of the stochastic system to the fluid ap- 
proximation. We plot the fluid and simulation results together, normalized to n = 10. (E.g., for the system 
with 400 agents in each pool we divide all processes by 40.) 

The following parameters aie used for all three simulations : 

= /^2,2 = 1; = /"2,i = 0.8, 6\ = 62 = 0.5. In addition, we take ri 2 = ^2^1 = 1. We take 
^",2 = ^2,1 = 0-3n; n,2 = -r2,i = 0.02n, so that, for n = 50, 100, 400, we have fcf 2 = ^2,1 = 15, 30, 120 
and ri,2 = T2,i = 1, 2, 8, respectively. 

7.1 A Single Overload Incident 

The first example aims to check whether FQR-ART detects overloads automatically when they occur and 
starts sharing in the right direction, and whether, once an overload incident is over, FQR-ART avoids oscil- 
lations. In particular, over the time interval [0, 60] the arrival rates are as follows: = n throughout that 
time interval. Over [0, 20) and [40, 60] the arrival rate to pool 1 is A" = n. Hence, both pools are normally 
loaded during these two subintervals. However, during the interval [20, 40) the arrival rate of class 1 changes 
to A" = 1.4n, so that, during [20, 40) the system is overloaded, and pool 2 should be helping class 1. 

We compare the solution to the fluid equations, solved using the algorithm, to an average of 1000 
independent simulation runs for the three cases n = 50, 100, 400. The results are shown in Figures [lO]- 
[T2]below. In addition Figure [T3]plots qi — ri 292 — ^1,2- The fact that shortly after time 20 the value is 0, is 
a strong indication that the numerical solution is correct, because during most of the overload period, when 
sharing takes place, it should hold that di^2{x{t)) = 0. 

The simulation experiments indicate that the fluid model approximates well the mean behavior of the 
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Figure 12: Fluid vs. simulations of Zi,2 Figure 13: plot of qi - ri^2q2 - ki,2 

system even for relatively small systems, e.g., when n = 50. Of course, the accuracy of the approximation 
grows as n becomes larger. The simulation experiments show that FQR-ART quickly detects the overload 
and the correct direction of sharing. Moreover, the control ensures that there are no oscillations, as in ^ 

Another observation is that when the system is normally loaded and there is no sharing, the fluid model, 
which has null queues, does not describe the queues well. In those cases there is an increased importance 
to stochastic refinements for the queues. If there is only negligible sharing, as FQR-ART ensures, then such 
stochastic refinements are well approximated by diffusion limits for the Erlang A model, as in |14J. 

7.2 Switching Overloads 

In the second example we consider an overloaded system, with pool 1 being overloaded initially, and with 
the direction of overload switching after some time, making pool 2 overloaded. Specifically, we let the 
arrival rates be A" = 1.4n and = n over [0,20), and A" = n, Aj = 1.4n on [20,40]. The results are 
plotted in Figures [T4][T6l 

Figure [TTlplots 2(72 — ^1,2 and ^2^192— (Zi— ^2,1- Once again, the fact that the appropriate difference 

process equals to shortly after the corresponding overload begins is an indication that the solution to the 
ODE is correct, since each queue is calculated via the AP, without forcing the relations (ii 2(a:(t)) = and 
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d2,l{x{t)) = 0. 




Figure 14: Fluid vs. simulations of Figure 15: Fluid vs. simulations of Q2 

As in the figures in ^7.1[ it is easily seen from the figures above that the fluid model approaches a fixed 
point, so long as the arrival rates are fixed. Then, once a change in the rates occurs, the fluid goes through a 
new transient period until it relaxes in a new fixed point. We elaborate in ^below. 

7.3 General Non-stationary Model with switching overloads 

We next test our algorithm in challenging example. That example is unrealistic, since the arrival rates 
and staffing functions are not likely to change as drastically as in that example (at least in call centers). 
We assume that the arrival rate to pool 1 over the time period [0, 20) is sinusoidal. We further assume that 
management anticipated the basic sinusoidal pattern of the arrival rate, but did not anticipated the magnitude, 
so that pool 1 is overloaded. To accommodate the sinusoidal pattern, we assume that staffing follows the 
appropriate infinite-server approximation; see, e.g.. Equation (9) in lITTI . The purpose of that staffing rule in 
our setting, is to stabilize the system at a fixed point eventually, as in the examples above. In particular, for 
t e [20, 40] we take 
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X1{t) = 1.3n + O.lnsin(t) and m1{t) = n + 0.05n[sin(t) - cos(t)]; \^{t) = n and m^(t) = n. 

Then, on the time interval [20, 40] the overload switches, with pool 2 becoming overloaded and experi- 
encing a sinusoidal arrival rate. However, we now take fixed staffing in both service pools. In particular, the 
parameters over the second overload interval [20, 40] is 

X'lit) = n and m^(t) = n; X^{t) = l.ln + O.lnsin(t) and m'!^{t) = n. 
Thus, we test two overload settings in this example. In the first interval, we can see whether the fluid 
approximation stabilizes. Since there is sharing of class- 1 customers, previous results in the literature (see 
the review in ^1.21 ) do not extend directly to our case. In the second interval, we expect to see a sinusoidal 
behavior of the system, because the staffing in both pools is fixed. In particular, the fluid model should not 
approach a fixed point after the switch at time t = 20. 

We now simulate the case n = 100 and n = 400 to make the figures clear, since in the second period. 
Fi gures [T8l - l2T] demonstrate the effectiveness of the fluid model and the numerical algorithm. First, Figure [T8] 
shows that the control stabilizes the two queues at the appropriate relation, and that the algorithm captures 
that well. (We emphasize again that each queue is computed independently via the AP.) As expected, the 
fluid over [0, 20) approaches a fixed point, and exhibits a sinusoidal behavior after t = 20, with the accuracy 
of the approximation growing together with the size of the system. 




Figure 18: Fluid vs. simulation, di,2 and pigure 19: Fluid vs. simulations of Z^^ 

d2,i and Z^^^ 

Removing Agents When the Staffing function is Decreasing. In applications, there is a need to deter- 
mine how to remove agents from the pool when the staffing function is decreasing, when there are not 
enough idle agents and there is a need to remove busy agents in order for the actual number of agents to be 
close to the staffing function. (Note that, since the staffing functions are assumed to be continuous almost 
everywhere, we cannot have the actual staffing be exactly equal to the staffing function, since the number of 
agents in the stochastic system is an integer and changes by ±1.) In a call-center, it is unlikely that agents 
be removed while still serving customers. Hence, in our simulations an agent is removed from the pool 
only if he is idle, or immediately after he completes a service. Specifically, at each service completion we 
check wether there are "too many" agents in the pool (compared to the ceiling of of the staffing function: 
[m"(i)]), and only then the newly-available agent is removed from the pool, if needed. Figure |22] shows 
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Figure 20: Fluid vs. simulations of Q\ 
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Figure 21: Fluid vs. simulations of Q2 



the actual number of agents in Pool 1 for the case n = 100 (the average of the 1000 simulations), and the 
staffing function m^{t) given above. Clearly, the staffing rule just explained provides an actual staffing that 
is very close the target one. Moreover, at time t = 20 the number of agents has a downward jump, and we 
see that the staffing rule we employed mimics this jump, although, unhke in the fluid approximation, the 
actual staffing can only jump down by 1 at a time. This behavior is to be expected, since there are many 
service completions over short time intervals in large systems. 



8 Stationarity and Quasistationarity 

The fluid model is said to be stationary if it is fixed at a single point x* £ Mg. and x* is said to be a stationary 
point. In other words, x* is a stationary point if x{s) = x* for some s > 0, implies that x{t) = x* for all 
t > s or, equivalently, x{t) = 0, for all t > s. Hence, a stationary point is also a fixed point of the fluid 
model. 

For each fixed arrival rates and staffing function there exists a unique stationary point. This was proved 
in §8 of |[27l for the case in which class 1 is overloaded and one-way sharing is performed with pool 1 
helping. The exact same reasonings can be used to prove this statement when class 2 is overloaded with 




Figure 22: Fluid vs. simulations: Number of agents in pool 1 
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one way sharing. If both classes are normally loaded or underloaded and there is no sharing in the fluid 
approximation, then this claim follows easily from known results for the Erlang A model. 

The models we consider here are nonstationary. However, as long as the arrival rates and staffing func- 
tions remain fixed, or if the staffing changes appropriately with respect to the arrival pattern, as in Period 1 
of the example in ^7.31 then there exists a point that would have been a stationary point, if the rates of the 
system would never change. We call such points quasistationary points. 

In the stationary model considered in our previous papers, we have proved that the unique stationary 
point of the fluid model is exponentially stable, i.e., that the fluid model converges to that points exponen- 
tially fast, for any initial condition x(0) which has sharing in the appropriate direction; see Theorem 9.2 in 
fTTl . It stands to reason that a similar result holds for any quasistationary point, provided they exist but, 
unUke in 1271 . here this should hold for any initial condition. We do not prove this assertion, but observe 
that the simulation experiments in ^ ^7.1| and |7.2[ as well as in the first period of the example in ^7.3[ suggest 
that this claim holds, at least in many reasonable examples. 

9 Conclusions and Further Research 

In this paper we studied a time-varying X model experiencing periods of overloads. We observed that our 
previous control, FQR-T, which was designed to respond to unexpected overloads, needs to be adjusted to 
the time-varying environment, in order to respond quickly to changed in the direction of the overload and 
then prevent behavior that FQR-T may cause in these more complex settings. We thus suggested using 
a modification of the control, which we named FQR-ART. With FQR-ART, the one-way sharing rule is 
weakened by adding the release thresholds. To avoid oscillations of the service process, which in turn 
causes congestion collapse, we further replaced the activation thresholds of FQR-T, that were suggested to 
be as in ([5]), with thresholds that are of order n, i.e., satisfying We then developed a fluid model, based on 
a stochastic AP, to approximate the transient evolution of the time- varying system under FQR-ART, together 
with an algorithm to numerically compute that fluid model. Simulation experiments suggest that this fluid 
model captures the main dynamics of the system, even in extreme cases, as the one considered in (17.31 ). 

Future Research. As the literature review show, oscillatory behavior and congested collapse are important 
phenomenons in other stochastic networks. Our methods and insights here could potentially have impact in 
other applications as well. However, it remains to show that FQR-ART indeed prevents the bad behavior 
from occurring. Specifically, it remains to prove that the ODE in (l20l) posses a unique solution and that, at 
least under some reasonable regularity conditions, oscillatory behavior is prevented. 

Due to the importance of piecewise constant models, it will be useful to prove that for each of the 
intervals of fixed rates and staffing, and regardless of the starting point of the fluid model at any such 
interval, there is convergence to the appropriate quasistationary point. 

While we view the above research questions to be the most fundamental from the mathematical point of 
view, it is also beneficial to prove that the fluid model arises as the many-server heavy-traffic fluid limit of a 
properly-scaled sequence of X systems operating under FQR-ART. (Note that such a proof will also imply 
that the ODE (l20l) has a unique solution.) 



30 



References 

[1] Aksin, Z., Armony, M., Mehrotra, V., 2007. The modern call center: a multi-disciplinary perspective 
on operations management research. Production Open Management, 16(6) 655-688. 

[2] Armony, M., Israelit, S., Mandelbaum, A., Marmor, Y., Tseytlin, Y., Yom-tov, G., 2010. Patient Flow 
in Hospitals: A Data-Based Queueing-Science Perspective. Working paper 

[3] Boyle, A., Beniuk, K., Higginson, I., Atkinson, P., 2012. Emergency department crowding: Time for 
interventions and policy evaluations. Emergency Medicine International. 

[4] Choudhury G. L., Leung, K. K., Whitt, W., 1995. Efficiently Providing Multiple Grades of Service 
with Protection Against Overloads in Shared Resources. AT&T Technical Journal, 74 (4), 50-63. 

[5] Chan, C. W., Armony, M, Bambos, N., 2011. Fairness in overloaded parallel queues. Working paper 

[6] Chan, C. W., Yom-Tov, G., Escobar, G., 2011. When to use Speedup: An Examination of Intensive 
Care Units with Readmissions. Working paper 

[7] J. G. Dai, 1995. On positive Harris recurrence of multiclass queueing networks: a unified approach via 
fluid Umit models. Annals of Applied Probability, 5 (1), 49-77. 

[8] Deo, S., Gurvich, 1., 2011. Centralized vs. Decentralized Ambulance Diversion: A Network Perspec- 
tive. Mangement Sci. 57 (7), 1300-1319. 

[9] Doshi, B., Heffes, H., 1986. Overload performance of several processor queueing disciplines for the 
M/M/1 queue. IEEE Transactions on Communications, 34 (6), 538-546. 

[10] ErramilU, A., Forys, L.J., 1991. Oscillations and chaos in a flow model of a switching system. IEEE 
journal on Selected Areas in Communications, 9 (2), 171-178. 

[11] Feldman, Z., Mandelbaum, A., Massey, W. A., Whitt, W. 2008. Staffing of time-varying queues to 
achieve time-stable performance. Management Sci. 54 (2), 324—338. 

[12] Forys, L.J. and Im, C.S., Henderson, W., 1988. Analysis of Load Box Testing for Voice Switches. 
ITC-12, Torino. 

[13] Gans, N., Koole, G., Mandelbaum, A., 2003. Telephone call centers: tutorial, review and research 
prospects. Manufacturing and Service Opns. Mgmt., 5, 79-141. 

[14] Gamett, O., Mandelbaum, A., Reiman, M. I., 2002. Designing a call center with impatient customers. 
Manufacturing Service Oper. Management, 4 (3), 208-227. 

[15] Gurvich, I., Perry, O., 2012. Overflow networks: Approximations and impUcations to call-center out- 
sourcing. Operations Research, 60 (4), 996-1009. 

[16] Harrison J. M., Zeevi., A., 2005. A method for staffing large call centers using stochastic fluid models. 
Manufacturing Service Oper. Management, 1 (1), 20-36. 



31 



[17] Jennings, O.B., Mandelbaum, A., Massey, W.A., Whitt, W., 1996. Server staffing to meet time-varying 
demand. Mangement Sci. 42 (10), 1383-1394. 

[18] Khalil, H. K., 2002. Nonlinear Systems. Prentice Hall, New Jersey. 

[19] Klemm, F. Le Boudec, J.Y., Aberer, K., 2006. Congestion control for distributed hash tables. In: Fifth 
IEEE International Symposium on Network Computing and Applications. 

[20] Komer U., 1991. Overload control of SPC systems. Teletraffic and data traffic 

[21] Liu, Y., Whitt, W., 2011. Large-Time Asymptotics for the Gt/Mt/st + Git Many-Server Fluid Queue 
with Abandonment. Queueing Syst, 67, 145-182. 

[22] Liu, Y., Whitt, W., 2012a. Stabilizing Customer Abandonment in Many-Server Queues with Time- 
Varying Arrivals. Operations Research, forthcoming. 

[23] Liu, Y, Whitt, W., 2012b. A Many-Server Fluid Limit for the Gt/GI/st + GI Queueing Model 
Experiencing Periods of Overloading. Operations Research Letters, forthcoming. 

[24] Pang, G., Talreja, R., Whitt, W., 2007. Martingale proofs of many-server heavy-traffic limits for Marko- 
vian queues. Probability Surveys, 4, 193-267. 

[25] Perry, O., Whitt, W., 2009. Responding to unexpected overloads in large-scale service systems. Man- 
agement Sci., 55 (8), 1353-1367. 

[26] Perry, O., Whitt, W., 2011a. A fluid approximation for service systems respond- 
ing to unexpected overloads. Operations Res., 59 (5), 1159-1170. Available at: 
http://www.columbia.edu/~ww2040/allpapers.html 

[27] Perry, O., Whitt, W., 2011b. An ODE for an overloaded X model involving a stochastic averaging 
principle. Stochastic Systems, 1 (1), 17-66. 

[28] Perry, O., Whitt, W., 2012. A fluid limit for an overloaded X model via an av- 
eraging principle. Mathematics of Operations Research, forthcoming. Available at: 
[http://www .columbia.ed u/~ww2 040/allpape rs.html 

[29] Perry, O., Whitt W, 2012. Diffusion approximation for an overloaded X model via a stochastic aver- 
aging principle. Working paper. Available at: http://www.columbia.edu/~ww2040/allpapers.html, 

[30] Powell, E.S., Khare, R.K., Venkatesh, A.K., Van Roo, B.D., Adams, J.G., Reinhardt, G., 2012. The 
relationship between inpatient discharge timing and emergency department boarding. The Journal of 
emergency medicine, 42 (2), 186-196. 

[31] Schulzrinne, H., Kurose, J.F, Towsley, D., 1990. Congestion control for real-time traffic in high-speed 
networks. IEEE proceeding in Ninth Annual Joint Conference of the IEEE Computer and Communi- 
cation Societies, 543-550. 



32 



[32] Shah, D., Wischik, D., 2011. Fluid models of congestion collapse in overloaded switched networks. 
Queueing Syst, 69, 121-143. 



[33] Shi, P., Chou, M., Dai, J. G., Ding, D., Sim, J., 2012. Hospital Inpatient Operations: Mathematical 
Models and Managerial Insights. Working paper 

[34] Stolyar, A. L., Tezcan, T., 2010. Contorl of systems with flexible multi-server pools: a shadow routing 
approach. Queueing syst 66 1-51. 

[35] Stolyar, A. L., Tezcan, T., 2011. Shadow-routing based control of flexible multiserver pools in over- 
load. Operations Research 59 (6), 1427-1444. 

[36] Teschl, G., 2009. Ordinary Differential Equations and Dynamical Systems, Universitat Wien. Available 
online: www.mat.univie.ac.at/~gerald/ftp/book-ode/ode.pdf 

[37] Whitt, W., 2002. Stochastic-Process Limits, New York, Springer, 2002. 

[38] Whitt, W., 2004. Efficiency-driven heavy-traffic approximations for many-server queues with aban- 
donments. Management Sci. 50 (10) 1449-1461. 

[39] Whitt, W., 2006. Fluid Models for Multiserver Queues with Abandonments. Operations Research, 54 
(1) 37-54. 

A Choosing the Activation Thresholds 

For an example, we consider the case of recovering from an overload period with pool-1 helping queue 2. 
We assume that 2^2, i(0) > but Aj < fii^inn, i = 1,2. If Ai = imi, then the condition 2^2, i(0) > 
implies that pool 1 is overloaded, since Z2,i{t) > for all t > 0. The maximal service rate in pool 1 over 
time is 



where we assume that type-1 agents are less efficient serving class-2 customers, i.e., that /Lt2,i < 
Our goal is to choose the thresholds on the queues and the release thresholds such that qi{T) < /ci 2, for 
T = inf{t > : Z2,i{t) = T2,i}. Since the arrival rate to pool 2 is such that that pool is not overloaded, 
queue 2 quickly decreases after the arrival rates have changed back to normal, so that no more class-2 
customers are sent to pool 1. Hence, Z2,i satisfies the ODE Z2,iit) = —iJ.2,iZ2,i{t), t > 0, whose unique 
solution, given the initial condition, is 



~ ^l,2(*)) + IJ'2,lZ2,l{t) = Ai - - fl2,l)z2,lit) < Ai, t> 0, 



^2,i(t) = 2^2,1 (O)e-'^^'^*, 



te[0,T], 



(29) 



Recalling that Ai = //i,imi, the fluid model of qi{t) over t G [0, T] satisfies the ODE 



qi{t) = Ai - /ii,i(mi - Z2,i{t)) - H2,iZ2,i{t) - 9iqi{t) 
= - M2,i)^2,i(0 - diqiit), t G [0,r]. 



(30) 
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It follows from ^ and dSOjl that 
so that, fort e [0,T], 

q,{t) = (^^.i-/^i^i)^^^i(0)e-(M..+^O^ + L (0) - Ki-m,i)^2.i(0)\ ^3^^ 

Since we cannot know a-priori what values qi{0) and 2:2,1(0) will have, we use the upper bounds Xi/Oi 
and mi, respectively. The bound for qi{0) is achieved by considering all the type-1 servers working with 
class 2, so that none of the class- 1 customers receives any help, and the only output from the queue is due 
to abandonment. 

Taking our example in the beginning of the section with ^2,1(0) < mi < 1 and qi{0) < Xi/Oi =2, 

r2,i = ^2,i(T) < e-°-5^ and qi{T) < + e-^^^ . 

We first choose T2^i to be O.Olmi (if our system has 100 agents in pool 1, then we might choose the release 
threshold to be a bit larger, e.g., O.OSmi). Then by time T = 9.21 the thi^eshold must have been crossed 
(in the fluid limit). Plugging T = 9.21 in dSB gives gi(9.21) < 1.01. We thus take k^^ > 0.066n. 
For example, for n = 1000, we should have k2i > 101. However, taking into consideration stochastic 
fluctuations, which are of order ^/n in normally loaded systems (operating in the QED regime) we see 
that k2,i = 101 may be too small. Recall that we considered kl^^ = 150 as an appropriate threshold for 
the purpose of preventing sharing when the two pools are normally loaded, and for detecting unexpected 
overloads. The above analysis shows that this value for fcg ^ is appropriate also from fluid considerations. 
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Proprtions Serving Others In Both Pools 
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